TD Models: Modeling the World at a Mixture of Time Scales
Abstract
Temporal-difference (TD) learning can be used not just to predict rewards, as is commonly done in reinforcement learning, but also to predict states, i.e., to learn a model of the world's dynamics. We present theory and algorithms for intermixing TD models of the world at different levels of temporal abstraction within a single structure. Such multi-scale TD models can be used in model-based reinforcement-learning architectures and dynamic programming methods in place of conventional Markov models. This enables planning at higher and varied levels of abstraction, and, as such, may prove useful in formulating methods for hierarchical or multi-level planning and reinforcement learning. In this paper we treat only the prediction problem: that of learning a model and value function for the case of fixed agent behavior. Within this context, we establish the theoretical foundations of multi-scale models and derive TD algorithms for learning them. Two small computational experiments are presented to test and illustrate the theory. This work is an extension and generalization of the work ... Model-based reinforcement learning offers a potentially elegant solution to the problem of integrating planning into a real-time learning and decision- ... Dean et al., in prep). However, most current reinforcement-learning systems assume a single, fixed time step: actions take 1 ...
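The idea of using TD learning to predict states rather than rewards can be illustrated with a small tabular sketch. This is not the paper's algorithm: it is a minimal TD(0) example under assumed settings, in which `M[s][j]` estimates the expected discounted number of future visits to state `j` when starting from state `s` under fixed behavior. The names `td_state_model` and `make_ring_walk`, the five-state ring world, and the values of `gamma` and `alpha` are all hypothetical choices made for illustration.

```python
import random


def td_state_model(step, n_states, start, n_steps, gamma=0.9, alpha=0.1):
    """Tabular TD(0) estimate of discounted future state visits.

    M[s][j] approximates the expected sum over k >= 1 of
    gamma**(k - 1) * [state at time t + k equals j], given state s at time t.
    `step` is an assumed function that samples the next state.
    """
    M = [[0.0] * n_states for _ in range(n_states)]
    s = start
    for _ in range(n_steps):
        s_next = step(s)
        for j in range(n_states):
            # Target: the observed next state (one-hot) plus the
            # discounted prediction already made from that next state.
            observed = 1.0 if j == s_next else 0.0
            target = observed + gamma * M[s_next][j]
            M[s][j] += alpha * (target - M[s][j])
        s = s_next
    return M


def make_ring_walk(n_states, p_forward, rng):
    """Hypothetical world: a biased random walk around a ring of states."""
    def step(s):
        move = 1 if rng.random() < p_forward else -1
        return (s + move) % n_states
    return step


rng = random.Random(0)
M = td_state_model(make_ring_walk(5, 0.7, rng), n_states=5, start=0, n_steps=20000)

# Each row is a discounted visit distribution, so it should sum to
# roughly 1 / (1 - gamma) once every state has been updated many times.
row_sums = [sum(row) for row in M]
for s, row in enumerate(M):
    print(s, [round(x, 2) for x in row])
```

In this sketch `gamma` sets the time scale of the prediction: values near 0 give an almost one-step model of the next state, while values near 1 spread the prediction over a long horizon.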
Similar resources
Time Series Modeling of Coronavirus (COVID-19) Spread in Iran
Various types of Coronaviruses are enveloped RNA viruses from the Coronaviridae family and part of the Coronavirinae subfamily. This family of viruses affects neurological, gastrointestinal, hepatic, and respiratory systems. Recently, a new member of this family, named Covid-19, is moving around the world. The expansion of Covid-19 carries many risks, and its control requires strict planning ...
TD Models: Modeling the World at a Mixture of Time
Temporal-difference (TD) learning can be used not just to predict rewards, as is commonly done in reinforcement learning, but also to predict states, i.e., to learn a model of the world's dynamics. We present theory and algorithms for intermixing TD models of the world at different levels of temporal abstraction within a single structure. Such multi-scale TD models can be used in model-based rein...
Potentials of Evolving Linear Models in Tracking Control Design for Nonlinear Variable Structure Systems
Evolving models have found applications in many real world systems. In this paper, potentials of the Evolving Linear Models (ELMs) in tracking control design for nonlinear variable structure systems are introduced. At first, an ELM is introduced as a dynamic single input, single output (SISO) linear model whose parameters as well as dynamic orders of input and output signals can change through ...
Unsteady-state Computational Fluid Dynamics Modeling of Hydrogen Separation from H2/N2 Mixture
3D modeling of Pd/α-Al2O3 hollow fiber membrane by using computational fluid dynamic for hydrogen separation from H2/N2 mixture was considered in steady and unsteady states by using the concept of characteristic time. Characteristic time concept could help us to design and calculate surface to volume ratio and membrane thickness, and adjust the feed conditions. The contribution of resistance be...
Molecular Dynamics Modeling of Hypersonic Gas-Phase and Gas-Surface Reactions
Abstract. Efforts to use molecular dynamics (MD) to develop both non-equilibrium dissociation models required in the shock layer as well as gas-surface interaction models specifically for surface catalysis will be summarized. First, an accelerated MD algorithm for dilute gases is presented, called the Event-Driven/Time-Driven (ED/TD) MD method. The method detects and moves molecules directly to...